Inversion of F0 model for natural-sounding speech synthesis

Authors

  • Pierluigi Salvo Rossi
  • Francesco Palmieri
  • Francesco Cutugno
Abstract

Natural-sounding speech synthesizers require information from a model that describes prosody quantitatively. Fujisaki’s model [1] has shown considerable accuracy on many languages [4][6]. We propose a method for estimating the parameters of Fujisaki’s model, i.e. an inversion method, based on the relative extrema of the pitch contour and a gradient-algorithm refinement procedure. Preliminary results show excellent performance of the proposed method in matching pitch contours. Preliminary synthesis results using the features obtained are encouraging.
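The abstract does not reproduce the model equations, so the following is only a minimal sketch of the forward Fujisaki model in its commonly cited form: the log of F0 is a base value plus phrase components (impulse responses) and accent components (step responses). Inversion, as described above, means estimating these parameters from a measured pitch contour. The function names, the argument layout, and the default values of `alpha`, `beta`, and `gamma` are illustrative assumptions, not taken from the paper.

```python
import math

def phrase_response(t, alpha=2.0):
    """Phrase control impulse response Gp(t); zero before the command."""
    if t < 0:
        return 0.0
    return alpha * alpha * t * math.exp(-alpha * t)

def accent_response(t, beta=20.0, gamma=0.9):
    """Accent control step response Ga(t), capped at the ceiling gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)

def fujisaki_f0(t, fb, phrases, accents, alpha=2.0, beta=20.0, gamma=0.9):
    """Return F0 in Hz at time t (seconds).

    fb      -- base frequency in Hz
    phrases -- list of (onset_time, amplitude) phrase commands
    accents -- list of (onset_time, offset_time, amplitude) accent commands
    alpha, beta, gamma -- assumed illustrative constants, not from the paper
    """
    log_f0 = math.log(fb)
    for t0, ap in phrases:
        log_f0 += ap * phrase_response(t - t0, alpha)
    for t1, t2, aa in accents:
        log_f0 += aa * (accent_response(t - t1, beta, gamma)
                        - accent_response(t - t2, beta, gamma))
    return math.exp(log_f0)

# Example: one phrase command and one accent command, sampled every 10 ms.
contour = [fujisaki_f0(i * 0.01, 100.0, [(0.0, 0.5)], [(0.3, 0.6, 0.4)])
           for i in range(100)]
```

An inversion procedure would search for the command times and amplitudes that make such a generated contour match the measured one; the gradient refinement step mentioned in the abstract is not sketched here because its details are not given.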

Similar resources

A new model of intonation for use with speech synthesis and recognition

This paper describes a synthesis from analysis scheme for producing natural sounding intonation for speech synthesis. The paper presents a new method of describing F0 contours in terms of three basic phonetic intonation elements. Details are given of an automatic system for labelling F0 contours, which could be used for speech recognition purposes. Current work on extracting a phonological desc...

Perceptual Foundations for Naturalistic Variability in the Prosody of Synthetic Speech

Recent studies have shown that the Tonal Center of Gravity is a better classifier than F0 Turning Points for at least two contrastively timed pitch accents in American English intonation contours. Within this framework, a binary F0 weighting function derived from the F0 contour can be used instead of the natural F0 contour without a degradation in discrimination performance. This success has im...

Applying a Hybrid into Seamless Speech

We present a speech synthesizer to seamlessly concatenate recorded and synthetic phrases to produce natural sounding and highly expressive speech. Not only the acoustic units, but also the F0 contours are seamlessly concatenated together from recorded and synthetic phrases. When mixed with recorded phrases, the F0 contours of synthetic phrases are generated adaptively relative to the actual sur...

A multi-layer F0 model for singing voice synthesis using a b-spline representation with intuitive controls

In singing voice, the fundamental frequency (F0) carries not only melody, but also music style, personal expressivity and other characteristics specific to voice production mechanism. The F0 modeling is therefore critical for a natural-sounding and expressive synthesis. In addition, for artistic purposes, composers also need to have control over expressive parameters of the F0 curve, which is m...

Publication date: 2003